feat: Add generic database connection mechanism #754
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.tech>
```rust
/// The name of the Secret object containing the admin user credentials and database connection details.
/// Read the
/// TODO docs
pub metadata_database: MetadataDatabaseConnection,
```
BIG TODO: Currently this PR adds the field to both v1alpha1 and v1alpha2. We need to discuss what our strategy for breaking changes and CRD versioning is.
Just to throw a random idea in: we might want to release this as v1alpha3 instead, so as not to break existing deployments (which kind of is the point of CRD versioning).
@maltesander, @Techassi, @NickLarsenNZ and @sbernauer met and decided to move forward as is (Option 1. below):
Context: stackabletech/hive-operator#674
hive-operator: Currently v1alpha1 only
airflow-operator: Currently v1alpha1 and v1alpha2
Problem: We switch from `.spec.clusterConfig.database` to `.spec.clusterConfig.metadataDatabase`, which requires the user to provide a complex enum describing a generic database connection instead of a "YOLO" connection string.
### Example 1
```yaml
spec:
  clusterConfig:
    database:
      connString: jdbc:postgresql://postgresql:5432/hive
      credentialsSecret: hive-credentials # provides username & password
      dbType: postgres # special HMS thing, as it needs to use the correct SQL file for database migrations
```
to
```yaml
spec:
  clusterConfig:
    metadataDatabase: # complex enum
      postgresql:
        host: postgresql
        database: hive
        credentialsSecret: hive-credentials
```
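The "complex enum" above might be modeled roughly as follows. This is a hedged sketch only: the type and variant names are illustrative, not the actual CRD structs, and the real implementation would carry serde/kube derives and live in a shared crate.

```rust
// Illustrative model of a structured database connection replacing the
// free-form connString. Variant names and fields are assumptions.
#[derive(Debug, PartialEq)]
enum MetadataDatabaseConnection {
    Postgresql {
        host: String,
        port: u16,
        database: String,
        credentials_secret: String,
    },
    Derby {
        location: String,
    },
}

impl MetadataDatabaseConnection {
    /// Render the JDBC URL the operator previously took verbatim from
    /// `.spec.clusterConfig.database.connString`.
    fn jdbc_url(&self) -> String {
        match self {
            Self::Postgresql { host, port, database, .. } => {
                format!("jdbc:postgresql://{host}:{port}/{database}")
            }
            Self::Derby { location } => {
                format!("jdbc:derby:;databaseName={location};create=true")
            }
        }
    }
}

fn main() {
    let conn = MetadataDatabaseConnection::Postgresql {
        host: "postgresql".into(),
        port: 5432,
        database: "hive".into(),
        credentials_secret: "hive-credentials".into(),
    };
    // Matches the connString from Example 1 above.
    assert_eq!(conn.jdbc_url(), "jdbc:postgresql://postgresql:5432/hive");
    println!("{}", conn.jdbc_url());
}
```

The upside of the structured form is that the operator can validate each field and assemble the connection string itself, instead of trusting an opaque user-supplied URL.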
### Example 2
```yaml
spec:
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/hive;create=true
      credentialsSecret: hive-credentials
      dbType: derby
---
apiVersion: v1
kind: Secret
metadata:
  name: hive-credentials
type: Opaque
stringData:
  username: "DUMMY - THIS IS NEVER READ!"
  password: "DUMMY - THIS IS NEVER READ!"
```
to
```yaml
spec:
  clusterConfig:
    metadataDatabase: # complex enum
      derby: # you can also just use an empty object ({}); defaults will work for you
        location: /tmp/hive # optional, defaults to some /tmp/<unique> folder
```
### Example 3
In Airflow it's more complicated, as Celery and Redis are involved as well:
```yaml
apiVersion: airflow.stackable.tech/v1alpha1
kind: AirflowCluster
metadata:
  name: airflow
spec:
  clusterConfig:
    credentialsSecret: test-airflow-credentials
---
apiVersion: v1
kind: Secret
metadata:
  name: test-airflow-credentials
type: Opaque
stringData:
  adminUser.username: airflow
  adminUser.firstname: Airflow
  adminUser.lastname: Admin
  adminUser.email: airflow@airflow.com
  adminUser.password: airflow
  connections.sqlalchemyDatabaseUri: postgresql+psycopg2://airflow:airflow@airflow-postgresql/airflow
  connections.celeryResultBackend: db+postgresql://airflow:airflow@airflow-postgresql/airflow
  connections.celeryBrokerUrl: redis://:redis@airflow-redis-master:6379/0
```
to
```yaml
apiVersion: airflow.stackable.tech/v1alpha1
kind: AirflowCluster
metadata:
  name: airflow
spec:
  clusterConfig:
    credentialsSecret: airflow-admin-credentials
    metadataDatabase:
      postgresql:
        host: airflow-postgresql
        database: airflow
        credentialsSecret: airflow-postgresql-credentials
  celeryExecutors:
    resultBackend:
      postgresql:
        host: airflow-postgresql
        database: airflow
        credentialsSecret: airflow-postgresql-credentials
    broker:
      redis:
        host: airflow-redis-master
        credentialsSecret: airflow-redis-credentials
---
apiVersion: v1
kind: Secret
metadata:
  name: airflow-admin-credentials
type: Opaque
stringData:
  adminUser.username: airflow
  adminUser.firstname: Airflow
  adminUser.lastname: Admin
  adminUser.email: airflow@airflow.com
  adminUser.password: airflow
---
apiVersion: v1
kind: Secret
metadata:
  name: airflow-postgresql-credentials
stringData:
  username: airflow
  password: airflow
---
apiVersion: v1
kind: Secret
metadata:
  name: airflow-redis-credentials
stringData:
  username: ""
  password: redis
```
Meeting minutes/Ideas:
Anyway: We want fallible conversion support in any case :rocket:
1. Just break v1alpha1 and basically merge as-is
* The entire point of the CRD version is to not break existing versions
+ Well that's what we did for the past ~5 years ^^
+ We could offer a ready-to-use bash script in our documentation
Note: In the case of airflow-operator this means to break all versions (v1alpha1 and v1alpha2), as otherwise we need to convert between v1alpha1 and v1alpha2, which imposes all the discussed problems.
2. Add it in v1alpha2
+ We don't break existing setups
* For all of the above options: We need fallible conversion support in stackable-versioned/stackable-webhook.
* WRONG: ~~hive-operator needs to be able to handle v1alpha1 *and* v1alpha2 simultaneously, which makes the Rust code complicated.~~
This is wrong, as our operators always only look at the latest version (v1alpha2 in this case).
Normally this works just fine; in this case, however, an SDP bump from 26.3 to 26.7 will break the **reconciliation** of your stacklets, as the conversion fails.
But this only breaks reconciliation; the StatefulSets should continue to work.
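The fallible conversion mentioned above could be sketched with plain `TryFrom` between illustrative version structs (an assumption for illustration only; the real implementation would live in stackable-versioned/stackable-webhook and operate on the actual CRD types). The key point: a conversion that cannot be done mechanically must return an error instead of guessing.

```rust
// Hypothetical old-version shape: a free-form connection string.
#[derive(Debug)]
struct V1Alpha1Database {
    conn_string: String,
}

// Hypothetical new-version shape: the structured enum (only one variant shown).
#[derive(Debug, PartialEq)]
enum V1Alpha2MetadataDatabase {
    Postgresql { host: String, database: String },
}

impl TryFrom<V1Alpha1Database> for V1Alpha2MetadataDatabase {
    type Error = String;

    fn try_from(old: V1Alpha1Database) -> Result<Self, Self::Error> {
        // Only a well-known URL shape can be converted automatically;
        // anything else must fail so the webhook can report an error.
        let rest = old
            .conn_string
            .strip_prefix("jdbc:postgresql://")
            .ok_or_else(|| format!("cannot auto-convert connString {:?}", old.conn_string))?;
        let (authority, database) = rest
            .split_once('/')
            .ok_or_else(|| "missing database name in connString".to_string())?;
        let host = authority.split(':').next().unwrap_or(authority).to_string();
        Ok(Self::Postgresql { host, database: database.to_string() })
    }
}

fn main() {
    // The convertible case from Example 1:
    let ok = V1Alpha1Database { conn_string: "jdbc:postgresql://postgresql:5432/hive".into() };
    assert_eq!(
        V1Alpha2MetadataDatabase::try_from(ok).unwrap(),
        V1Alpha2MetadataDatabase::Postgresql { host: "postgresql".into(), database: "hive".into() }
    );
    // An unconvertible case must surface as an error, not a silent guess:
    let bad = V1Alpha1Database { conn_string: "jdbc:derby:;databaseName=/tmp/hive".into() };
    assert!(V1Alpha2MetadataDatabase::try_from(bad).is_err());
}
```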
2.1. Let the /convert HTTP call fail with a 500: "we can't do that for you, here are the docs on how to do it yourself (ideally with a ready-to-use bash script)"
2.2. Let the 26.7 Helm chart deploy a one-time Kubernetes migration Job that executes a bash script that creates/modifies/deletes Secrets
* A bit dangerous to create and modify/delete production Secrets :O
* E.g. what does Argo CD think of this?
2.3. In some situations (e.g. hive, but NOT airflow/superset) we might be able to return 200 for upgrades and 500 for downgrades
* We need to test how Kubernetes behaves in that situation
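For option 2.1, note that per the Kubernetes apiextensions.k8s.io/v1 API, the documented way to fail a conversion is to return an HTTP 200 with a `Failure` result in the `ConversionReview` response, which the API server then surfaces to the client. A sketch (the message text is a placeholder):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: ConversionReview
response:
  uid: "<uid copied from the request>"
  result:
    status: "Failure"
    message: >-
      Cannot convert .spec.clusterConfig.database to
      .spec.clusterConfig.metadataDatabase automatically;
      please follow the migration guide in the docs
```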
```yaml
- host
type: object
type: object
celeryResultBackend:
```
note: Should this be `resultBackend` and `broker` instead?
@adwk67, any opinions?
Description
Part of stackabletech/issues#238
Definition of Done Checklist
Author
Reviewer
Acceptance
- `type/deprecation` label & add to the deprecation schedule
- `type/experimental` label & add to the experimental features tracker